Cell Genomics
Top medRxiv preprints most likely to be published in this journal, ranked by match strength.
Understanding genetic factors of complex traits across ancestry groups holds a key to improving the overall health care quality for diverse populations in the United States. In recent years, multiple electronic health record-linked (EHR-linked) biobanks have recruited participants of diverse ancestry backgrounds; these biobanks make it possible to obtain phenome-wide association study (PheWAS) summary statistics on a genome-wide scale for different ancestry groups. Moreover, advancement in bioinfo...
Drug discovery is costly, with billions spent on failed trials. Drugs with genetic support from genome-wide association studies (GWAS) have substantially greater odds of success, but how best to use GWAS to prioritize drug targets remains unclear. We evaluated the performance of GWAS causal gene prioritization methods from the Open Targets consortium by cross-referencing their predictions with drug trial outcomes for 445 diseases. We found that neither expression quantitative trait locus colocal...
Human genetics holds great potential for drug discovery, but challenges in identifying causal genes limit its clinical translation. Pleiotropy, the phenomenon where genetic variants or genes influence multiple traits, has been previously used to explain clinical associations between diseases and propose drug repurposing opportunities. However, its potential to systematically inform drug discovery remains unknown. To this end, we evaluate whether genetic similarity between 178 phenotypes across 1...
Large biobanks, including the Million Veteran Program (MVP), the UK Biobank, and FinnGen, provide genetic association results for more than 1,000,000 individuals for hundreds of phenotypes. To select targets for pharmaceutical development, as well as to improve the understanding of existing targets, we harmonized these studies, and performed two-sample Mendelian Randomization (MR) on 2,003 phenotypes using genetic variants associated with gene expression (derived from GTEx and eQTLGen) and plasm...
Electronic Health Records (EHR)-linked biobanks have emerged as promising tools for precision medicine, enabling the integration of clinical and molecular data for individual risk assessment. Association studies performed in biobank studies can connect common genetic variation to clinical phenotypes, such as through the use of polygenic scores (PGS), which are starting to have utility in aiding clinician decision making. However, while biobanks aggregate large amounts of data effectively for suc...
Whilst the use of single-cell RNA sequencing (scRNA-seq) to understand target biology is well established, its predictive role in increasing the clinical success of therapeutic targets remains underexplored. Inspired by previous work on an association between genetic evidence and clinical success, we used retrospective analysis of known drug target genes to identify potential predictors of target clinical success from scRNA-seq data. We investigated whether successful drug targets are associated...
Asthma, allergic rhinitis, and atopic dermatitis are common, complex traits that are frequently co-morbid and have strong genetic correlation. However, the extent to which genome-wide genetic correlation between traits reflects shared causal variants or risk genes remains unclear. To address this question, we used functional fine-mapping. We generated genomic annotations from primary cells treated with immunomodulatory stimuli, then used these data to identify likely causal variants mediating ge...
BACKGROUND: Therapeutic targets supported by genetic evidence from genome-wide association studies (GWAS) show a higher probability of success in clinical trials. GWAS is a powerful approach to identify links between genetic variants and phenotypic variation; however, identifying the genes driving associations identified in GWAS remains challenging. Integration of molecular quantitative trait loci (molQTL) such as expression QTL (eQTL) using Mendelian randomization (MR) and colocalization analyses...
Introduction: Privacy protection is a core principle of genomic research but needs further refinement for high-throughput proteomic platforms. Methods: We identified independent single nucleotide polymorphism (SNP) quantitative trait loci (pQTL) from COPDGene and Jackson Heart Study (JHS) and then calculated genotype probabilities by protein level for each protein-genotype combination (training). Using the most significant 100 proteins, we applied a naive Bayesian approach to match proteomes to gen...
Abstract: Precision therapeutics depends on the ability to reason jointly over genes, variants, drugs, diseases, adverse drug reactions (ADRs), and molecular pathways without contaminating evaluation with future knowledge. I present a large-scale pharmacogenomic knowledge graph (PGx-KG) that integrates PharmGKB, ClinVar, SIDER, and Reactome (harmonized to HGNC, RxNorm, MeSH, and ChEBI identifiers), yielding 3,744,727 nodes and 9,645,367 edges across six major relation families. A le...
Background: ABO blood types have widespread clinical use and robust associations with cardiovascular disease. Many studies determine ABO blood types using tag single nucleotide polymorphisms (tSNPs) to characterize functional variation. However, tSNPs with low linkage disequilibrium (LD) may promote misinference of ABO blood types, particularly in diverse populations. Methods: Bibliographic databases were searched for studies (2005-2022) using tSNPs to determine ABO alleles in accordance with PRISM...
Safety-related issues account for approximately 28% of failures in new drug discovery programs. On top of that, many are discovered during post-marketing surveillance, significantly limiting drug utility and application. To proactively address these concerns, we developed a genetics-led strategy leveraging Mendelian Randomization (MR) across large-scale genetic datasets from the Million Veteran Program, FinnGen, and UK Biobank. By mapping genetic variants associated with gene expression and prot...
The Genome Informed Risk Assessment (GIRA) report from eMERGE has become a standard approach to implement genomic precision medicine at scale. Here, we assess GIRA's utility and impact in a health care system independent of eMERGE, focusing on 9 adult conditions using the Penn Medicine Biobank (PMBB, n=48,279). We find a large number of patients - 50.1% (n=24,185) - were deemed by GIRA as high-risk for at least one of the 9 conditions with 30.4% (n=14,676) due to polygenic and/or monogenic risk. ...
Background: Asthma pathophysiology varies by age-of-onset and involves diverse immune processes reflected in white blood cell (WBC) subsets. To investigate the genetic architecture of asthma and potential endophenotypes, we analyzed the chr17q12-q21 locus, a robustly replicated asthma locus, across European (EUR), African (AFR), East Asian (EAS), and South Asian (SAS) ancestry groups from the UK Biobank (UKB) and Biobank Japan (BBJ). The largest EUR sample was further stratified by age-of-onset as ...
In genome-wide association studies (GWAS), combining independent case-control cohorts has been successful in increasing power for meta and joint analyses. This success sparked interest in extending this strategy to GWAS of rare and common diseases using existing cases and external controls. However, heterogeneous genotyping data can cause spurious results. To harmonize data, we propose a new method, two-stage imputation (TSIM), where cohorts are imputed separately, merged on intersecting high-qu...
Most current GWAS-eQTL approaches prioritize genes whose mediating effects on complex traits act through cis-regulation, while trans-acting genes remain largely underexplored. Recent perturbational screening technology provides a novel approach to quantifying trans-effects between gene pairs, but its integration with GWAS data remains largely unexamined. We introduce Mr. PEG, a novel framework that integrates perturbational screens, eQTL, and GWAS summary data to identify mediating genes of comp...
Introduction: Asthma is a complex and chronic inflammatory disorder with varying degrees of airway inflammation. It affects ~235 million people worldwide, and about 8% of the United States population. Unlike single-gene disorders, asthma phenotypes are guided by a highly variable combination of genotypes, making it a complex disease to study computationally. Recently, several independent high-throughput gene expression studies in bioinformatics have identified and proposed numerous molecular dri...
In the development of new drugs, one of the leading causes of late-stage failures is off-target adverse effects, but they are difficult to predict before expensive large-scale clinical trials. Proteomic changes observed in randomized controlled trials (RCTs) and Mendelian randomization estimates of the effects of these changes can provide valuable evidence about the likely effects of drugs on health outcomes. We provide proof of principle for this approach using data from the ILLUMINATE trial o...
Background: Genetic control of gene expression in asthma-related tissues is not well-characterized, particularly for African-ancestry populations, limiting advancement in our understanding of the increased prevalence and severity of asthma in those populations. Objective: To create novel transcriptome prediction models for asthma tissues (nasal epithelium and CD4+ T cells) and apply them in transcriptome-wide association study (TWAS) to discover candidate asthma genes. Methods: We developed and vali...
While respiratory diseases such as COPD and asthma share many risk factors, most studies investigate them in isolation and in predominantly European ancestry populations. Here, we conducted the most powerful multi-trait and -ancestry genetic analysis of respiratory diseases and auxiliary traits to date. Our approach improves the power of genetic discovery across traits and ancestries, identifying 44 novel loci associated with lung function in individuals of East Asian ancestry. Using these resu...